the velocity of sound. It must not be supposed that this represents the real 
accuracy of the results, especially at velocity ratios above 1.1 or 1.2, where 
the effect of the zero point may be serious. It does seem, however, possible 
to claim that the tables and curves of this paper give values of these coeffi- 
cients of the true order of magnitude, and show correctly the type of change 
that takes place as the velocity of the shell passes through the velocity of 
sound. 



An Experimental Determination of the Distribution of the 
Partial Correlation Coefficient in Samples of Thirty. 

By Captain J. W. Bispham, R.E. 
(Communicated by Prof. C. J. Martin, F.R.S. Received November 26, 1919.) 

Part I. Samples from an Uncorrelated Universe. 

1. Introductory: 

In 1908* "Student" dealt experimentally with the distribution of the 
total correlation coefficient of small samples. In particular, he dealt with 
values of n as low as 4 for the case of zero correlation in the sampled 
population. In 1913 H. E. Soper† theoretically determined the mean correlation 
and the standard deviation of the distribution of correlations to second 
approximations. In 1915 R. A. Fisher‡ gave an equation for the frequency 
distribution of r, and in 1917, as a result of a co-operative study by 
H. E. Soper, A. W. Young, B. M. Cave, A. Lee and K. Pearson,§ this was 
reduced to suitable form for numerical manipulation, and the frequency 
distributions and frequency constants for samples of size ranging from n = 3 
to n = 400 were given for values of the correlation in the sampled population 
ranging from ρ = 0 to ρ = 0.9. The present experimental investigation was 
commenced in 1914, but had to be put aside during the war. It was intended 
to determine whether the distribution of partial correlation coefficients for 
samples as small as 30 showed greater dispersion than is observed for total 
correlation coefficients. Yule|| has shown that for normal distributions and 
large samples the standard deviations of the distributions should be of the 

* 'Biometrika,' vol. 6, p. 302, et seq. 

† 'Biometrika,' vol. 9, p. 91, et seq. 

‡ 'Biometrika,' vol. 10, p. 507, et seq. 

§ 'Biometrika,' vol. 11, p. 328, et seq. 

|| G. U. Yule, 'Roy. Soc. Proc.,' A, vol. 79 (1907). 
same magnitude. The experiment can now be related to the complete 
evaluation of the distributions of total correlations referred to above.* 



2. Nature of Sampled Population. 

The population to be sampled was obtained artificially as follows: Thirty 
"counters" bearing the numbers +1 to +15 and -1 to -15 were drawn in 
random order from a container and the numbers were written down in the 
order drawn. This set of thirty numbers represents the varying values of the 
first attribute. The values of the second attribute were obtained in exactly 
the same way, viz., the counters were returned to the container, shaken up, 
and drawn again in random order, the numbers drawn being written down 
beside the first set of thirty. Similarly, the varying values of the third 
attribute were obtained by a third draw. This artificial population has the 
advantage that the mean value of each attribute in each sample is zero, and 
further, the standard deviations are invariable from sample to sample. This 
saved a good deal of arithmetical work. 
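
The drawing procedure described above can be sketched in a few lines of Python. This is only an illustration: the names `COUNTERS`, `SUM_SQ`, `draw_attribute`, `draw_sample` and `total_correlation` are hypothetical and do not come from the paper, and a pseudo-random shuffle stands in for the physical drawing of counters. It shows where the saving in arithmetic comes from: every attribute has mean zero and the same sum of squares, so a total correlation reduces to a sum of products divided by a constant.

```python
import random

# The thirty counters: +1 to +15 and -1 to -15 (zero does not occur).
COUNTERS = [v for v in range(-15, 16) if v != 0]

# Sum of squares is the same for every attribute in every sample.
SUM_SQ = sum(v * v for v in COUNTERS)


def draw_attribute(rng):
    """Draw all thirty counters in random order (one attribute)."""
    values = COUNTERS[:]
    rng.shuffle(values)
    return values


def draw_sample(rng):
    """One sample of thirty individuals with three attributes each."""
    return [draw_attribute(rng) for _ in range(3)]


def total_correlation(x, y):
    """Product-moment correlation; means are zero and the standard
    deviations are fixed, so only the sum of products varies."""
    return sum(a * b for a, b in zip(x, y)) / SUM_SQ


rng = random.Random(1919)  # arbitrary seed, chosen only for repeatability
x1, x2, x3 = draw_sample(rng)
r12 = total_correlation(x1, x2)
r13 = total_correlation(x1, x3)
r23 = total_correlation(x2, x3)
```

Repeating `draw_sample` 1000 times would give a set of coefficients comparable in kind, though not in detail, with those tabulated in the paper.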

The frequency distribution of the attributes in the artificial population is 
very far from normal, being of rectangular form with a central strip missing, 
since any of the numbers except zero occurs as often as any other, while the 
number zero does not appear at all. A further advantage of so irregular a 
distribution lay in the fact that the investigation was intended, in part, to 
test the reliability of partial correlation coefficients obtained in respect of 
groups of registration districts used in a previous investigation, and in such 
cases the frequency distribution is, of course, irregular. 

3. Observed Distributions of Total and Partial Correlation Coefficients. 

The values of the total correlation coefficient between the various pairs of 
attributes will be referred to generally as r₁₂, r₁₃ and r₂₃. The partial 
correlation coefficient determined was that between attributes 1 and 2 for 3 constant. 
The notation ₃r₁₂ will be used. 
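
The paper does not print the expression used to obtain ₃r₁₂ from the three total correlations. The sketch below assumes the usual first-order partial correlation formula, (r₁₂ - r₁₃r₂₃)/√((1 - r₁₃²)(1 - r₂₃²)); the function name `partial_correlation` is hypothetical.

```python
import math


def partial_correlation(r12, r13, r23):
    """First-order partial correlation of attributes 1 and 2 with
    attribute 3 held constant (standard formula, assumed here)."""
    denominator = math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))
    return (r12 - r13 * r23) / denominator


# When attribute 3 is uncorrelated with both others, the partial
# coefficient equals the total coefficient.
unchanged = partial_correlation(0.5, 0.0, 0.0)
# Equal total correlations of 0.5 reduce the partial coefficient.
reduced = partial_correlation(0.5, 0.5, 0.5)
```

In an uncorrelated universe r₁₃ and r₂₃ are usually small, so the partial coefficient differs little from r₁₂ in most samples.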

In all, 1000 cases were worked and the observed distributions are shown 
in detail in Table I. 

The distribution given in column 6 is that deduced from the investigation 
referred to above. R. A. Fisher's† equation for the distribution is shown to 
lead to 

* 'Biometrika,' vol. 11, p. 328, et seq. 
† 'Co-operative Study,' loc. cit. 
This is a very convenient form for the evaluation of the distributions for 
successive values of n. 



Table I. Total and Partial Correlation Coefficients: 
Frequency distribution for 1000 samples where n = 30 and ρ = 0 in each case. 

(1)                  (2)        (3)        (4)        (5)        (6)
Range of r.          Observed   Observed   Observed   Observed   Theoretical.†
                     r₁₂.       r₁₃.       r₂₃.       ₃r₁₂.

+0.575 to +0.625        1          1         ..         ..         ..
+0.525 to +0.575        1          2          2          2         1.7*
+0.475 to +0.525       ..         ..          3          1         2.5
+0.425 to +0.475        6          9          5          6         5.4
+0.375 to +0.425        9          8         11          7        11.0
+0.325 to +0.375       19         16         22         23        19.3
+0.275 to +0.325       27         35         29         35        30.8
+0.225 to +0.275       43         40         44         39        45.3
+0.175 to +0.225       69         53         63         70        61.5
+0.125 to +0.175       64         69         90         73        77.7
+0.075 to +0.125      105         96         90         91        91.6
+0.025 to +0.075       96         98         96         87       101.0
-0.025 to +0.025      112        120        104        118       104.3
-0.025 to -0.075      108        114        110        107       101.0
-0.075 to -0.125       98        102         96         90        91.6
-0.125 to -0.175       65         84         76         61        77.7
-0.175 to -0.225       54         50         64         60        61.5
-0.225 to -0.275       46         48         46         51        45.3
-0.275 to -0.325       42         27         17         43        30.8
-0.325 to -0.375       11         15         12         14        19.3
-0.375 to -0.425       11          6         12          8        11.0
-0.425 to -0.475        6          3          4         11         5.4
-0.475 to -0.525        3          2          1          1         2.5
-0.525 to -0.575        3          2          3          2         1.7*
-0.575 to -0.625       ..         ..         ..         ..         ..
-0.625 to -0.675        1         ..         ..         ..         ..

Total                1000       1000       1000       1000      1000

* These are frequencies for the whole range to the limits of the distribution. 
† 'Co-operative Study,' loc. cit. 

In the paper in question the ordinates are given for n = 25 and n = 50. 
The ordinates for n = 30 were calculated by the method indicated above, 
and checked to six significant figures by the alternative method given on 
p. 332. The intermediate ordinates forming the beginnings and ends of the 
frequency groups were calculated by the latter method, and Simpson's 
formula was used for finding the areas, viz., 

$$\int_{0}^{h} y\,dx = \frac{h}{6}\left\{ y_0 + 4y_{1/2} + y_1 \right\}.$$
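
The theoretical column of Table I can be approximately reproduced with a short sketch. It rests on an assumption not taken from the text as it survives here: that for ρ = 0 the frequency curve of r is proportional to (1 - r²)^((n - 4)/2), the standard result for an uncorrelated normal population, with the constant chosen so that the total area is unity. The function names are hypothetical. Each group area is found by Simpson's formula from the ordinates at the two ends and the middle of the group.

```python
import math


def r_density(r, n):
    """Ordinate of the frequency curve of r for rho = 0 (assumed form:
    proportional to (1 - r^2)^((n - 4) / 2), normalised to unit area)."""
    log_const = (math.lgamma((n - 1) / 2.0)
                 - math.lgamma((n - 2) / 2.0)
                 - 0.5 * math.log(math.pi))
    base = max(0.0, 1.0 - r * r)  # guard against rounding at r = +/-1
    return math.exp(log_const) * base ** ((n - 4) / 2.0)


def simpson_area(lo, hi, n):
    """Area of one group by Simpson's formula with a middle ordinate."""
    mid = 0.5 * (lo + hi)
    h = hi - lo
    return h / 6.0 * (r_density(lo, n) + 4.0 * r_density(mid, n)
                      + r_density(hi, n))


def expected_frequency(lo, hi, n=30, samples=1000):
    """Expected number of samples whose r falls between lo and hi."""
    return samples * simpson_area(lo, hi, n)


central = expected_frequency(-0.025, 0.025)
adjacent = expected_frequency(0.025, 0.075)
```

Under these assumptions `central` and `adjacent` agree with the tabulated 104.3 and 101.0 to the first decimal place.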



An inspection of Table I leads one to expect a reasonably good fit in all 
cases. Before referring to the determination of " goodness of fit," it may be 
well to give particulars of the frequency constants of the observed distribu- 
tions and of those calculated from the theoretical investigation.* These are 
given below : — 

Table II. Values of Frequency Constants of Distributions in Table I. 

(1)           (2)          (3)          (4)          (5)          (6)
Constant.     r₁₂.         r₁₃.         r₂₃.         ₃r₁₂.        Theoretical.

Mean         -0.0029      +0.0008      +0.0077      -0.0021       0.0000
μ₂            0.03396      0.03148      0.03287      0.03526      0.03448
σ_r           0.1843       0.1774       0.1813       0.1878       0.1857
μ₃           -0.000578    +0.000664    +0.000099    -0.000322     0.0000
μ₄            0.003455     0.003004     0.003152     0.003545     0.003346
β₁            0.00851      0.01413      0.00028      0.00236      0.00000
β₂            2.9948       3.0311       2.9158       2.85036      2.81362
1/√(n - 1)    0.1857       0.1857       0.1857       0.1857       0.1857
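
As a check on the first three rows of Table II, the constants for r₁₂ can be recomputed from the grouped frequencies of Table I. The sketch below is illustrative only. It takes each group at its central value and applies Sheppard's correction (width²/12) to the second moment; the paper does not say that this correction was used, so it is an assumption, made because it brings the result into agreement with the tabulated μ₂. The names `CENTRES`, `R12_FREQ` and `grouped_moments` are hypothetical.

```python
import math

WIDTH = 0.05
# Group centres from -0.65 up to +0.60.
CENTRES = [round(-0.65 + WIDTH * k, 2) for k in range(26)]
# Observed r12 frequencies from Table I, in the same order.
R12_FREQ = [1, 0, 3, 3, 6, 11, 11, 42, 46, 54, 65, 98, 108,
            112, 96, 105, 64, 69, 43, 27, 19, 9, 6, 0, 1, 1]


def grouped_moments(centres, freqs, width):
    """Mean, corrected second moment and standard deviation of a
    grouped frequency distribution."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(centres, freqs)) / total
    raw_mu2 = sum(f * (x - mean) ** 2 for x, f in zip(centres, freqs)) / total
    mu2 = raw_mu2 - width ** 2 / 12.0  # Sheppard's correction (assumed)
    return mean, mu2, math.sqrt(mu2)


mean_r12, mu2_r12, sigma_r12 = grouped_moments(CENTRES, R12_FREQ, WIDTH)
```

With the correction the values come out close to the tabulated -0.0029, 0.03396 and 0.1843; without it the second moment would be larger by about 0.0002.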



On p. 371 of the paper in question, it is stated that the condition 
β₁ = 0, β₂ = 3 for Gaussian distribution is not attained for samples of 
25 or 50. The distributions considered, however, are concerned with values 
of ρ as high as 0.9; and, for high values of ρ, the distributions of r are 
markedly skew. In the case now under consideration, however, the 
distribution is symmetrical, and an examination of Table II shows that the 
departure from the Gaussian type is not large. In fact, if the theoretical 
distribution be compared with the normal curve that fits it best (that is, the 
normal curve with the same standard deviation) it is found that, grouping 
the frequencies as below, the "goodness of fit," as measured by P, is 
0.98 ± 0.08. This expresses in another way that, for the special case ρ = 0 
and n = 30, the distribution of r's approaches closely to the normal, and, 
consequently, the standard deviation may be regarded as a good measure of 
the dispersion. The tabulated values of σ_r are subject to a probable error 
± 0.003, and it may be noted that they do not depart significantly from the 
value obtained from the usual expression 1/√(n - 1), which is itself identical 
to five places with the theoretical value. 



* 'Co-operative Study,' loc. cit. 
4a. Curve Fitting: Total Correlations. 

In the following Table III will be found the frequency distributions of 
Table I on 0.15 ranges centred about the tabulated value, together with the 
corresponding best fitting normal curve frequencies and the theoretical 
frequencies. 

Table III refers to the three sets of 1000 total correlations. The 
frequency distributions of the best fitting normal curves to the observed 
distributions are referred to as follows: 

N₁₂: best fitting normal curve for r₁₂. 
N₁₃: best fitting normal curve for r₁₃. 
N₂₃: best fitting normal curve for r₂₃. 
N_T: normal curve based on the theoretical value of σ_r as determined 
from the equations of the co-operative study. 
T: theoretical distribution. 



Table III. Distribution of Total Correlation Coefficients. 

Value of r.     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)
                r₁₂.    N₁₂.    r₁₃.    N₁₃.    r₂₃.    N₂₃.    N_T.    T.

+0.60, etc.       2      2.1      3      1.6      2      2.3     2.3     2.2
+0.45            15     19       17     15.9     19     19.2    19.4    18.5
+0.30            89     87.8     91     85.6     95     93.9    91.0    95.3
+0.15           238    228.0    218    234.7    243    239.9   230.3   230.8
 0.00           316    316.1    332    327.6    310    320.9   313.8   306.4
-0.15           217    223.8    236    233.0    236    224.4   230.3   230.8
-0.30            99     92.5     90     84.4     75     78.7    91.0    95.3
-0.45            20     18.5     11     15.6     17     15.8    19.4    18.5
-0.60, etc.       4      2.1      2      1.6      3      2.3     2.3     2.2

Values of P     0.77 ± 0.29     0.78 ± 0.29     0.988 ± 0.06    0.98



Values of P are given at the foot of the columns as paired. Comparing the 
actual distributions in columns (1), (3) and (5) with the theoretical 
distribution given in column (8), we get the following values of P: 

r₁₂ with T, P = 0.83 ± 0.27,* 
r₁₃ with T, P = 0.64 ± 0.31, 
r₂₃ with T, P = 0.70 ± 0.30. 

The value of P as between columns (7) and (8), i.e., between the normal 

* The probable error of P is rather large, as has been shown by Prof. K. Pearson, 
'Phil. Mag.,' vol. 31, April, 1916. 
curve based on the theoretical value of σ_r and the actual theoretical 
distribution is, as stated above, 0.98 ± 0.08. 
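
The values of P quoted in this section can be approximately recovered from the grouped frequencies. The sketch below is not the author's computation: it assumes that χ² is summed over the nine groups of Table III and referred to a χ² distribution with eight degrees of freedom (one fewer than the number of groups). For an even number of degrees of freedom the tail probability has a closed form, so no statistical library is needed. The function names are hypothetical; the second comparison uses the observed partial correlation frequencies grouped on the same 0.15 ranges.

```python
import math


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_even_df(chi2, df):
    """Probability of a chi-square at least this large, for even df:
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = chi2 / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Column (8) of Table III (theoretical) and column (1) (observed r12).
T = [2.2, 18.5, 95.3, 230.8, 306.4, 230.8, 95.3, 18.5, 2.2]
R12 = [2, 15, 89, 238, 316, 217, 99, 20, 4]
# Observed partial correlations on the same 0.15 ranges (from Table I).
PARTIAL = [2, 14, 97, 234, 312, 211, 108, 20, 2]

p_r12 = p_even_df(chi_square(R12, T), df=len(T) - 1)
p_partial = p_even_df(chi_square(PARTIAL, T), df=len(T) - 1)
```

Under these assumptions both results fall within about 0.01 of the quoted 0.83 and 0.77, which suggests, without proving, that the quoted values were obtained in this way.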

4b. Curve Fitting: Partial Correlations. 

The ensuing Table IV gives the corresponding frequencies of the partial 
correlations as observed, together with the corresponding values of the best 
fitting normal curve to the distribution, the best fitting Pearson curve, Type II, 
and the values of N_T and T as used above. In each case the values of P are 
given at the foot of the columns. 



Table IV. Frequency Distribution of Partial Correlation Coefficients. 

Value of       (1)             (2)             (3)              (4)             (5)
₃r₁₂.          Observed        Best fitting    Pearson curve,   N_T.            T.
               distribution.   normal curve.   Type II.

+0.60, etc.       2               2.5             1.3              2.3             2.2
+0.45            14              19.8            19.8             19.4            18.5
+0.30            97              90.9            94.3             91.0            95.3
+0.15           234             227.4           228.4            230.3           230.8
 0.00           312             310.4           305.6            313.8           306.4
-0.15           211             231.3           232.1            230.3           230.8
-0.30           108              94.1            97.5             91.0            95.3
-0.45            20              20.8            20.8             19.4            18.5
-0.60, etc.       2               2.7             1.3              2.3             2.2

Total          1000            1000            1000             1000            1000

Values of P                    0.60 ± 0.31     0.66 ± 0.31      0.56 ± 0.30     0.77 ± 0.29



5. Summary. 

It will be seen that the distribution of partial correlation coefficients is, on 
the whole, as well fitted by the theoretical distribution as are the distributions 
of total correlations. Comparing the four observed distributions with the 
distribution referred to as T, we have: 

For distribution of r₁₂,  P = 0.83 ± 0.27, 
For distribution of r₁₃,  P = 0.64 ± 0.31, 
For distribution of r₂₃,  P = 0.70 ± 0.30, 
For distribution of ₃r₁₂, P = 0.77 ± 0.29. 

The P's all lie within the ranges of each other's probable errors, so that it is 
not possible to say that one distribution is significantly better fitted than 
another. 

The experiment indicates, inter alia, that values of the partial correlation 
coefficient (as determined by samples of 30) as great as +0.5 may be expected 
when the true value is zero, about 4 times per 1000, and further that the 
dispersion of partial correlation coefficients for ρ = 0 and n = 30 is not 
significantly different from that evaluated according to the co-operative study 
(loc. cit.) referred to above. 

The results obtained above are of considerable practical importance. The 
partial correlation coefficient has been recently used in the analysis of many 
types of data. In the current psychological controversy* as to the existence 
of a central factor common to all forms of intellectual activity, the method of 
partial correlation has been used in respect of groups of individuals of the 
size now under consideration. Considerable use has also been made of the 
partial correlation coefficient in analysing vital statistics and, to refer to only 
one other sphere of application of the method, a recent pamphlet† published 
by the Meteorological Office indicates the extent to which total and partial 
correlations have been used in analysing weather data. 

A further experimental investigation in respect of samples from a highly 
correlated population has been in progress concurrently with the above, and 
it is hoped to publish the results in due course as Part II of this investigation. 

The numerical work in both cases has been extremely laborious.‡ In 
conclusion I beg to state that the investigation has been conducted under the 
general supervision formerly of Dr. E. C. Snow and latterly of Captain 
M. Greenwood, and owes much to their help and advice. 

* Prof. Spearman, 'American Journal of Psychology,' vol. 15, p. 284; Cyril Burt, 
'British Journal of Psychology,' vol. 3, pp. 94-177; W. Brown, "The Essentials of 
Mental Measurement," 'Camb. Univ. Press,' 1911; Dr. E. Webb, "Character and 
Intelligence," 'Brit. Jour. Psychology Monograph,' 1915; G. H. Thomson and 
J. C. M. Garnett, 'Brit. Jour. of Psychology,' vol. 9, p. 321, et seq.; J. C. Maxwell 
Garnett, 'Roy. Soc. Proc.,' Series A, vol. 96 (1919); Godfrey H. Thomson, 'Roy. Soc. 
Proc.,' Series A, vol. 95, 1918. 

† M.O. 223, 'The Computer's Handbook,' sec. V, 3. A Collection of Correlation 
Coefficients from Meteorological Papers, and a Note on the Partial Correlation 
Coefficient. By Captain E. H. Chapman, R.E. 

‡ I am especially indebted to Miss A. Fowler and to Mr. M. E. Wilson for help 
in this respect both for original working and for checking. Most of the original 
drawings were made by school children. 



